Similarity Join Algorithms: An Introduction

Author

  • Wei Wang

Similar resources

Optimal Dimension Order: A Generic Technique for the Similarity Join

The similarity join is an important database primitive which has been successfully applied to speed up applications such as similarity search, data analysis and data mining. The similarity join combines two point sets of a multidimensional vector space such that the result contains all point pairs where the distance does not exceed a given parameter ε. Although the similarity join is clearly CP...

An Efficient Similarity Join Algorithm with Cosine Similarity Predicate

Given a large collection of objects, finding all pairs of similar objects, namely the similarity join, is widely used to solve various problems in many application domains. The computation time of a similarity join is a critical issue, since a similarity join requires computing similarity values for all possible pairs of objects. Several existing algorithms adopt prefix filtering to avoid unnecessary similari...

Supporting KDD Applications by the k-Nearest Neighbor Join

The similarity join has become an important database primitive to support similarity search and data mining. A similarity join combines two sets of complex objects such that the result contains all pairs of similar objects. Two types of similarity join are well known: the distance range join, where the user defines a distance threshold for the join, and the closest point query or k-distance ...

Index-supported Similarity Join on Graphics Processors

The similarity join is an important building block for similarity search and data mining algorithms. In this paper, we propose an algorithm for similarity join on Graphics Processing Units (GPUs). As major advantages, GPUs provide extremely high parallelism combined with high bandwidth in data transfer to main memory. To exploit these advantages for similarity join, we propose an index structu...

A Cost Model and Index Architecture for the Similarity Join

The similarity join is an important database primitive which has been successfully applied to speed up data mining algorithms. In the similarity join, two point sets of a multidimensional vector space are combined such that the result contains all point pairs where the distance does not exceed a parameter ε. Due to its high practical relevance, many similarity join algorithms have been devised....
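
The distance range join defined in the abstracts above (all point pairs whose distance does not exceed ε) can be illustrated with a minimal sketch. This is a naive nested-loop baseline, not the method of any of the listed papers; the function name `epsilon_join` and the choice of Euclidean distance are assumptions made purely for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]


def epsilon_join(r: List[Point], s: List[Point], eps: float) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) where the distance between r[i] and s[j] is <= eps.

    Hypothetical helper: a naive O(|R| * |S|) baseline that compares every
    pair, assuming Euclidean distance as the metric.
    """
    result = []
    for i, p in enumerate(r):
        for j, q in enumerate(s):
            # math.dist computes the Euclidean distance between two points
            if math.dist(p, q) <= eps:
                result.append((i, j))
    return result


# Example: two small 2-D point sets joined with eps = 2.5
r_points = [(0.0, 0.0), (5.0, 5.0)]
s_points = [(0.0, 1.0), (5.0, 7.0), (9.0, 9.0)]
print(epsilon_join(r_points, s_points, 2.5))  # [(0, 0), (1, 1)]
```

Because the baseline evaluates every pair, its cost grows with the product of the two set sizes, which is the cost that index-based and filtering-based approaches aim to reduce.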

Publication date: 2008